A Specialised Verb Lexicon as the Basis of Fact Extraction in the Biomedical Domain

Authors

  • C. J. Rupp
  • Paul Thompson
  • William Black
  • John McNaught
Abstract

The BioLexicon is a standardised, reusable, lexical and conceptual resource suitable for advanced biomedical text mining. One of the unique features of the BioLexicon is the incorporation of rich syntactic and semantic patterns for a wide range of domain-relevant verbs, which have been acquired semiautomatically from biomedical corpora. Such types of information can be highly beneficial for information and fact extraction applications. In this paper, we describe the collection of the verb-specific information for inclusion in the BioLexicon, and explain how it is being employed in a specific scenario (the UKPMC project) to leverage fact-based information extraction on a large collection of biomedical papers.


Similar resources

Bootstrapping a Verb Lexicon for Biomedical Information Extraction

The extraction of information from texts requires resources that contain both syntactic and semantic properties of lexical units. As the use of language in specialized domains, such as biology, can be very different to the general domain, there is a need for domain-specific resources to ensure that the information extracted is as accurate as possible. We are building a large-scale lexical resou...


Adaptive Dictionary for Bilingual Lexicon Extraction from Comparable Corpora

One of the main resources used for the task of bilingual lexicon extraction from comparable corpora is the bilingual dictionary, which is considered a bridge between two languages. However, no particular attention has been given to this lexicon, except for its coverage and the fact that it can be drawn from the general language, the specialised one, or a mix of both. In this paper, we want t...


A Supervised Method for Constructing Sentiment Lexicon in Persian Language

Due to the increasing growth of digital content on the internet and social media, sentiment analysis is one of the emerging fields. This problem, which deals with information extraction and knowledge discovery from textual data using natural language processing, has attracted the attention of many researchers. Construction of a sentiment lexicon as a valuable language resource is one of the imp...


Verbs in specialised corpora: from manual corpus-based description to automatic extraction in an English-French parallel corpus

This paper tackles the issue of verbs in specialised corpora with a view to term extraction. Corpus-based manual descriptions to be used in various applications have highlighted the “deviant” uses of verbs in specialised corpora compared with general uses, as well as the need for verb extraction. However, very little attention has been given to verbs both in the terminology theory and automatic ter...


Development of a Greek biomedical corpus

Collection and annotation of specialized corpora for less-spoken languages such as Greek is a crucial endeavour for the development and growth of language technology research for these languages. This paper presents the design and compilation of a biomedical corpus that took place in the framework of the national R&D project “IATROLEXI” (http://www.iatrolexi.gr). The aim of IATROLEXI is to ...




Publication date: 2010